Adaptive coordinate, real-space electronic structure 
calculations on parallel computers 
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We present a method for electronic structure calculations that retains all of the advantages of real 
space and addresses the inherent inefficiency of a regular grid, which has equal precision everywhere. 
The computations are carried out on a regular mesh in curvilinear space, which allows natural and 
efficient decomposition on parallel computers, and effective use of iterative numerical methods. A 
novel feature is the use of error analysis to optimize the curvilinear grid for highly inhomogeneous 
electronic distributions. We report accurate all-electron calculations for H2, O, and O2. 



Ab initio electronic structure calculations are compu- 
tationally very challenging because the singular Coulomb 
potential of the ions results in highly localized core wave- 
functions with cusps at the ionic positions. Even when 
pseudopotentials are used to eliminate the core elec- 
trons, it is often desirable to treat valence electrons with 
highly localized wavefunctions (e.g., Is, 2p, 3d, or 4f va- 
lence electrons), on the same footing as delocalized ones. 
Typical implementations use a basis in a (one-particle) 
Hilbert space, the choice of which requires a tradeoff be- 
tween simplicity and fast convergence of physical quan- 
tities with the basis size. The simplest basis consists of 
plane waves. Its main drawback is uniform precision, 
leading to slow convergence for inhomogeneous systems 
like atoms, molecules, clusters, or solid surfaces. On the 
other hand, bases such as linearized augmented plane 
waves (LAPW) or muffin tin orbitals (LMTO) can be 
tailored to specific physical problems and therefore have 
excellent convergence properties. However, they lead to 
very complex equations. A promising alternative is a real 
space approach. All terms are local except for the Lapla- 
cian, which has a very short range. The resulting sparse 
Hamiltonian allows effective use of iterative algorithms, 
which vastly reduce both memory and time requirements, 
and is a prerequisite to any O(N) treatment of electronic 
structure. 

Harnessing the computational power of massively par- 
allel architectures imposes additional constraints on the 
choice of basis. To achieve good load balance, compu- 
tational complexity and memory requirements must be 
evenly divided among processors, a task made very diffi- 
cult by complex bases like LAPW and LMTO. Another 
important consideration is the minimization and localiza- 
tion of communication between processors. Since Fourier 
transforms (the underlying operations in a plane wave ba- 
sis) require communication between all processors, even 
plane waves are not an efficient basis in this respect. In 
contrast, a regular grid in real space is a very natural 
choice for a massively parallel computer architecture: as- 
signing an equal section of the grid to each processor pro- 
vides good load balance, minimizes interprocessor com- 
munication, and produces communication patterns that 
are both local and conflict-free. 



Chelikowsky and collaborators Q have reported real 
space electronic structure computations using a regular 
grid. Recently, Briggs et al. have used multi-grid ac- 
celeration to improve efficiency. Yet, a regular grid in 
real space suffers from the same drawbacks as a plane 
wave basis, i.e., it has the same resolution in every re- 
gion of space. Attempts to circumvent this problem have 
been pursued by Cho et al. || and by Wei et al. 0], us- 
ing wavelets as a basis. Another approach investigated 
by Tsuchida et al. || uses finite-elements with a non- 
uniform grid. Finally, Bylaska et al. || have reported cal- 
culations using multi-grid methods to enhance precision 
locally. Irrespective of whether the formalism is based on 
wavelets, finite-elements, or multi-grids, enhancing the 
resolution by locally adding more basis elements ruins 
the natural mapping onto a parallel architecture. 

Progress toward overcoming the limitations of plane 
wave bases has also been reported recently. Gygi 
introduced the concept of adapted plane waves, a dis- 
tortion of Fourier space that allows treatment of physi- 
cal space with different degrees of precision. Following 
that development, Hamann |^ and Devenyi et al. || re- 
ported calculations using similar approaches. The adap- 
tive plane wave approach eliminates the major drawback 
of the standard plane wave basis, but lacks the simplicity, 
sparseness, and natural parallelization properties of real 
space algorithms. 

In the present work, we combine the advantages of real 
space calculations and those of adaptive coordinates in 
a scheme for performing electronic structure calculations 
in the context of density functional theory and the local 
density approximation (DFT/LDA) |l(J, or the gener- 
alized gradient approximation Our scheme is effi- 
cient and accurate for systems with very inhomogeneous 
charge distributions and takes full advantage of massively 
parallel computer architectures. The central idea is to 
work on a regular grid, but in a curvilinear space £. The 
change of coordinates x(£), generates a single grid in real 
space x, which is finer where high precision is needed. We 
refer to our method as the Adaptive Coordinate, Real- 
space, Electronic Structure (ACRES) algorithm. It em- 
bodies the following advantageous features: 
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(1) It can achieve an essentially optimal distribution 
of grid points. This is accomplished by a versatile choice 
of the curvilinear coordinates, through which a grid of 
fixed size is adapted to provide resolution commensurate 
with the physics. The only cost of the adaptation is the 
introduction of a nontrivial metric <? a ^(£)- 

(2) The coordinate transform is chosen so as to min- 
imize the discretization error. The idea is to determine 
an a priori good set of curvilinear coordinates through 
the use of error analysis. This differs from Gygi's original 
approach |jj in the adaptive plane wave scheme, where 
the best possible change of coordinates is found through 
an energy minimization. 

(3) Since the communications pattern remains exactly 
the same as that of a regular grid in real space, highly 
efficient parallelization is trivially accomplished. 

(4) The sparsity of the equations allows us to take ad- 
vantage of iterative algorithms. This makes it possible 
to employ rather large grids, and consequently to inves- 
tigate complex systems. The computational time scales 
as N x n e with N the total number of points in the 3- 
dimensional grid and n e the number of electrons in the 
system. 

We will now describe the method in more detail. The 
real space coordinates x l (£ a ;P m ) depend on the curvi- 
linear coordinates and on some set of parameters P m 
that allow us to tune the change of coordinates to a par- 
ticular problem. The Jacobian of the transformation is 

J i a (t-,P) = dx i /df a (1) 

with | J | = det J its determinant. The trivial met- 
ric g IJ = Sij in real space corresponds to the metric 

g a P = J -1 " J — *f in curvilinear coordinates (summation 
over repeated indices implied). The Laplacian operator 
in curvilinear space is 

A = j^d a (\J\g al3 d ) , (2) 

and the integrals are transformed according to J d 3 x = 
J d 3 £ \J\. The Coulomb potential is found by solving the 
Poisson equation [discretized in curvilinear coordinates 
by means of the Laplacian, Eq. (0)] with the sum of the 
electronic and nuclear charge as the source. 

The equations arc discretized in a box of linear size Aj , 
using a finite difference scheme on a regular grid in curvi- 
linear space £ with Ni points in each direction |L^| . Any 
boundary conditions, including the phase shifts required 
to do multiple fc-point calculations for solids, can easily 
be implemented in this approach. In the following, we 
use periodic boundary conditions. 

Implementation of the method presents certain chal- 
lenges due to the freedom in choosing discretization 
schemes. The most important ones, and the manner in 
which we resolved them, are discussed here briefly: 

(a) Equations that are equivalent in the continuum 
limit are not necessarily equivalent after discretization. 



For example, there exist several expressions for the Lapla- 
cian which are equivalent in the continuum limit. From 
physical and computational considerations, it is desirable 
to have a self-adjoint discretization of the Laplacian. The 
expression given in Eq. (^J) is self-adjoint after discretiza- 
tion if, for a fixed pair of indices (a,/3), the finite differ- 
ence operators used to represent d a and dp are identical. 

(b) The order of the finite difference approximation for 
the derivatives is very important. The lowest order, two- 
point symmetric derivative is insufficient and does not 
give good results. Our experience indicates that we need 
to use a symmetric discretization for the derivatives with 
at least four points (second order). 

(c) The discrete representation of the nuclear charge 
is equally important. For an atom with atomic number 
Z at position R, the nuclear charge is p(£) = ZS^R) 
where <5(£; R) is a representation of a Dirac 8 function at 
R on the regular grid in £ space. Beside the normalization 
condition on the 8 function, an important constraint on 
its representation on a finite grid is that the first moment 
of the distribution must correspond to the location of the 
8 function, i.e., 

J d£\J\5&R)2$=M (3) 

We found the most useful representation to be a Gaussian 

6(i; R) <x exp - £o| 2 / 2a 2 Af) (4) 

with A£ the regular grid spacing, a an adjustable pa- 
rameter, and £o chosen to satisfy the constraint on the 
first moment of the distribution. This choice reduces the 
translational invariance problems discussed in the next 
point. 

(d) The presence of the grid breaks translational in- 
variance. We call the distance between an atomic center 
and the nearest grid point the offset. The energy de- 
pends on the offset, and this effect can be quite large due 
to the Coulomb singularity. The dependence is reduced 
by strong adaptation, which makes the cell of the real 
space grid very small near the atomic sites. The Gaus- 
sian representation of the 8 function further reduces the 
dependence of the energy on the offset because it results 
in a smoother transfer of charge as the position of an 
atom changes. The combined use of strong adaptation 
and a Gaussian 8 function eliminates the translation in- 
variance problem. 

(e) A final challenge is the actual choice of curvilin- 
ear coordinates. A necessary condition for the mapping 
between x and £ is that it must be one to one, i.e., the 
grid in x space must not be folded. As the Laplacian 
involves the derivative of the metric, and the metric is 
computed from the Jacobian, the mapping must be at 
least C 2 on the torus in order to ensure smoothness. It 
is also desirable that the mapping be spherically sym- 
metric around an atom. We use a two level coordinate 
transformation a?(£;P) with a global backdrop useful for 
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simulating isolated structures, and further local adapta- 
tion around each atom position. The global backdrop 
is a simple independent transformation along each axis 
x 1 = and creates a flat central region with a high 

density of grid points and a surrounding region with a 
decreasing density of grid points. The local adaptation 
creates a spherical deformation of the grid around each 
atomic center with the amount of adaptation A v and 
the size of the adapted region p u as independent vari- 
ables. As suggested by Gygi 0, for a given det J(R U ) 
and p v , the final results are not very sensitive to the de- 
tails of the formula for 2?(£). The computations presented 
below were carried out with the simple form for the local 
adaptation 

^ p ) - £ *o 

(5) 

with the function a(A, p) chosen such that p gives the 
real space width of the adapted region. 

A question of central importance is how to choose the 
different parameters of the grid so as to generate a nearly 
optimal mesh for a given physical problem. We resolve 
this issue by constructing an estimate for the error in the 
integrals and then choosing the parameters that minimize 
the error. To illustrate this point, consider a periodic, 
one-dimensional integral 1(f) = f d£f(Q computed nu- 
merically on a regular mesh 

i N (f) = j2^um (6) 

i 

with A£ = A/N. We evaluate the elementary error 
by comparing the integrals computed with N and N/2 
points [^3|. More precisely, with N/2 points, the rect- 
angular element of integration is Wjv/2 = 2A£/(£j). In 
comparison, the same element of integration computed 
with N points is 5I N = A£ [/(&_i)/2+/(&)+/(& + i)/2]. 
An elementary estimate of the error is given by 

5e(f) = SI N - SI N/2 = Ae f»/2. (7) 

If the constant 1/2 on the right hand side is replaced by 
1/12 we obtain a rigorous upper bound due to Peano juj . 
The error in the numerical integral is then estimated by 
the Li norm of 8e 

'(f) -5® W2 (l>« «"> 2 ) ■ < 8 » 

The above idea is easily generalizable to three- 
dimensional integrals (whereas the rigorous Peano bound 
is difficult to extend to higher dimensions) . The last step 
is to pick an integrand / so as to obtain an a priori esti- 
mate of the optimal grid parameters by minimizing e(f). 
By experimenting on several atoms we have found that 



/ = \ J\ pVx-s provides an adequate indicator. Due to 
large cancellations forced by the Kohn-Sham eigenvalue 
equation, this term gives the leading factor for the error 
in the total energy. 

Using the approach described in this paper, we have 
implemented DFT/LDA @ and DFT/GGA Q elec- 
tronic structure calculations on the CM-5 massively par- 
allel supercomputer. Within this approach, all-electron 
computations involving atoms in the first row of the 
periodic table are feasible. We have also implemented 
the pseudopotential approach, using the norm-conserving 
nonlocal pseudopotentials of Bachelet et al. |^6|, and 
the Klcinman-Bylandcr procedure to render the nonlo- 
cal components separable JlTj . 

For the all-electron calculations, the adaptation of the 
grid is determined by the requirement that the density 
of grid points near the atomic cores is sufficient to accu- 
rately represent the 1/r divergence of the Coulomb po- 
tential. For example, Fig. |l| shows a grid used for the H 2 
molecule calculation. This clearly indicates the very large 
difference between the spacing of grid points in the unoc- 
cupied vacuum region and near the atomic nuclei. Fig. ^ 
shows the occupied wave functions of the O2 molecule 
along a line through the centers of the two atoms. The 
enhancement of the grid resolution throughout the re- 
gions where the electronic wave functions are large and 
the very strong enhancement close to the nuclei allow 
accurate representation of the smooth tails of the wave 
functions as well as the cusps and nodes near the nuclei. 

For a more quantitative comparison to other theoret- 
ical results and to experiment, Table || shows our calcu- 
lated results for H2 and O. It is clear from this compari- 
son that our results are in complete agreement with other 
theoretical work using similar methods. Therefore, the 
difference between calculated values and experimental 
measurements reflects fundamental limitations of the un- 
derlying theory (DFT/LDA or DFT/GGA), rather than 
limitations in the accuracy of our method. 

The authors are grateful to F. Gygi for sharing his 
insight on adaptive methods. This work was supported 
by the Office of Naval Research grant N00014-93-1-0190. 
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TABLE I. Calculated bond length ao, vibrational fre- 
quency ui, and minimum energy Eq for H2 and atomic en- 
ergy Eat of O. The zero-point vibrational energy has been 
subtracted from the experimental total energy of H2. 





ACRES 


Other DFT Theory 


Experiment 


H 2 [LDA] 








a (a.u.) 


1.448 


1.446 a 


1.401 


/ — 1 \ 
to (cm ) 


4192 


4207 a 


4401 b 


E (Ry) 


-2.276 


— 2.27 c 


—2. 349" 


H 2 [GGA] 








a (a.u.) 


1.416 


1.413 a 


1.401 6 


lo (cm -1 ) 


4381 


4373 a 


4401" 


E (Ry) 


-2.340 


-2.34 c 


-2.349 b 


O [LDA] 








Eat (Ry) 


-148.870 


-148.938 d 


-150.027 e 


O [GGA] 








Eat (Ry) 


-149.912 


-149.994 d 


-150.027 e 



a Reference jig], S-VWN and B-LYP 
b Reference 1| 

c Reference 1 20 , LSD and PW GGA-II 
d Reference [21], LDA and PW91 

e Reference [22|, Spin unpolarized ground state, 2p 4 1 D 



FIG. 1. The 24 x 12 x 12 [a.u.] grid used for H 2 , in a 
horizontal cross-section through the atoms (every fourth line 
shown). Notice the effect of the global backdrop (crosslike 
pattern) and the local adaptation around each atom. 



FIG. 2. Occupied wave functions of the O2 molecule, along 
a line through the centers of the atoms. The n bonding and 
anti-bonding wave functions collapse onto the horizontal axis 
(they have nodes through the atomic centers). The Is bond- 
ing and anti-bonding states were scaled by a factor of 1/3 
so they could be displayed on the same scale. Points on the 
curves indicate values at actual grid points used in the calcu- 
lation. 



4 



3 




